vimfandomcom-20200223-history
Working with CSV files
CSV files (comma-separated values) are often used to save tables of data in plain text. Following are some useful techniques for working with CSV files. You can: *Highlight all text in any column. *View fields (convert csv text to columns or separate lines). *Navigate using the HJKL keys to go left, down, up, right by cell (hjkl work as normal). *Search for text in a specific column. *Sort lines by column. *Delete a column. Highlighting a column It's easy to find a column in csv text if you source the following script. Enter a command like :Csv 23 to highlight column 23. " Highlight a column in csv text. " :Csv 1 " highlight first column " :Csv 12 " highlight twelfth column " :Csv 0 " switch off highlight function! CSVH(colnr) if a:colnr > 1 let n = a:colnr - 1 execute 'match Keyword /^\(^,*,\)\{'.n.'}\zs^,*/' execute 'normal! 0'.n.'f,' elseif a:colnr 1 match Keyword /^^,*/ normal! 0 else match endif endfunction command! -nargs=1 Csv :call CSVH() Highlighting the header row If the first row is a header row, you can highlight the header fields by putting the following lines in the file syntax/csv.vim (see [[VimTip1565#Syntax highlighting|syntax highlighting for moin]]). This requires csv file type detection (see ''Usage'' below). syntax match csvHeading /\%1l\%(\%("\zs\%(^"\|""\)*\ze"\)\|\%(\zs^,"*\ze\)\)/ highlight def link csvHeading Type Viewing csv fields The following commands can be entered to convert csv text to columns for easy viewing. Work on a temporary copy of your data because these commands will damage it! " Convert csv text to columns (press u to undo). " Warning: This deletes ',' and crops wide columns. :let width = 20 :let fill = repeat(' ', width) :%s/\(^,*\),\=/\=strpart(submatch(1).fill, 0, width)/ge :%s/\s\+$//ge Alternatively, you can change each comma to a newline to put each field on its own line: " Change CSV fields on current line to a list of separate items. :s/,/\r/g " Same, for all lines. :%s/,/\r/g In the replace text of a substitute, \r substitutes a newline. Navigating in csv text The simple script given above does not provide easy navigation, and it assumes that commas are only used as delimiters. The following code has more features, and it is presented as a plugin due to its length. Features *Fields are correctly highlighted, according to the CSV specification: quotes, commas inside quotes, quote inside quotes are all correctly processed. *It does not go beyond the last column (from a count of the columns in the first and last three lines). *HJKL go left, down, up, right by "cell". Focus is set to the first character of the cell. *0 and $ highlight the first and last cell. Focus is set to the first character of the cell. *Ctrl-f, Ctrl-b page forward and back, while staying in the same column. *\J is Join and \K is Keyword (assuming the default \ local leader key ). *The column number and heading (from the first line in the buffer) are displayed when moving around. *Search within column. The command :SC n=str will search for str in the n-th column. If "n=" is omitted, the search is within the currently highlighted column. For example: :SC 2=john " search for john in the 2nd column only :SC john " search for john in the currently highlighted column Case sensitivity is the same as for / (toggle with :set ic! ic? see ). After the search, the @/ variable is set. So, for example, after :SC 2=john, one can use g//d to delete all lines whose second field contains john. *Sort lines by column. The command :Sort n will sort by the n-th column in increasing order. If n is omitted, then sort by the current highlighted column. *Delete column. The command :DC n will delete the n-th column. If n is omitted, then the current highlighted column is deleted. Usage Create file ~/.vim/ftplugin/csv.vim (Unix) or $HOME/vimfiles/ftplugin/csv.vim (Windows) containing the script below. Define file type detection for *.csv files (see [[VimTip1565#File type detection|file type detection for moin]]): autocmd BufNewFile,BufRead *.csv setf csv Open a file (named anything.csv) that contains fields separated by commas. Use H J K L 0 $ to move from cell to cell. If you open a *.csv file, the current column will be highlighted. Conversely, you may open a file with csv data, yet the file is not named *.csv, so highlighting will not occur. You can switch highlighting off by setting the filetype option to nothing: :set ft= This command switches highlighting on: :set ft=csv The code This is the ftplugin csv.vim script: " Filetype plugin for editing CSV files. " From http://vim.wikia.com/wiki/VimTip667 if v:version < 700 || exists("b:did_ftplugin") finish endif let b:did_ftplugin = 1 " Display a warning message. function! s:Warn(msg) echohl WarningMsg echo a:msg echohl NONE endfunction " Get the number of columns (maximum of number in first and last three " lines; at least one of them should contain typical csv data). function! s:GetNumCols() let b:csv_max_col = 1 for l in 2, 3, line('$') - 2, line('$') - 1, line('$') " Determine number of columns by counting the (unescaped) commas. " Note: The regexp may also return unbalanced ", so filter out anything " which isn't a comma in the second pass. let c = strlen(substitute(substitute(getline(l), '\%(\%("\%(^"\|""\)*"\)\|\%(^,"*\)\)', , 'g'), '"', '', 'g')) + 1 if b:csv_max_col < c let b:csv_max_col = c endif endfor if b:csv_max_col <= 1 let b:csv_max_col = 10000 call s:Warn("No comma-separated columns were detected.") endif return b:csv_max_col endfunction " Return regex to find the n-th column. function! s:GetExpr(colnr, ...) if a:0 0 " field only let field = '\%(\%("\zs\%(^"\|""\)*\ze"\)\|\%(\zs^,"*\ze\)\)' else " field with quotes (if present) and trailing comma (if present) let field = '\%(\%(\zs"\%(^"\|""\)*",\?\ze\)\|\%(\zs^,"*,\?\ze\)\)' endif if a:colnr > 1 return '^\%(\%(\%("\%(^"\|""\)*"\)\|\%(^,"*\)\),\)\{' . (a:colnr - 1) . '}' . field else return '^' . field endif endfunction " Extract and echo the column header on the status line. function! s:PrintColInfo(colnr) let colHeading = substitute(matchstr(getline(1), s:GetExpr(a:colnr)), '^\s*\(.*\)\s*$', '\1', '') let info = 'Column ' . a:colnr if empty(colHeading) echo info else echon info . ': ' echohl Type " Limit length to avoid "Hit ENTER" prompt. echon strpart(colHeading, 0, (&columns / 2)) . (len(colHeading) > (&columns / 2) ? "..." : "") echohl NONE endif endfunction " Highlight n-th column (if n > 0). " Remove previous highlight match (ignore error if none). " matchadd() priority -1 means 'hlsearch' will override the match. function! s:Highlight(colnr) silent! call matchdelete(b:csv_match) if a:colnr > 0 if exists('*matchadd') let b:csv_match = matchadd('Keyword', s:GetExpr(a:colnr), -1) else execute '2match Keyword /' . s:GetExpr(a:colnr) . '/' endif if b:changed_done != b:changedtick let b:changed_done = b:changedtick call s:GetNumCols() endif call s:Focus_Col(a:colnr) endif endfunction " Focus the cursor on the n-th column of the current line. function! s:Focus_Col(colnr) normal! 0 call search(s:GetExpr(a:colnr), '', line('.')) call s:PrintColInfo(a:colnr) endfunction " Highlight next column. function! s:HighlightNextCol() if b:csv_column < b:csv_max_col let b:csv_column += 1 endif call s:Highlight(b:csv_column) endfunction " Highlight previous column. function! s:HighlightPrevCol() if b:csv_column > 1 let b:csv_column -= 1 endif call s:Highlight(b:csv_column) endfunction " Wrapping would distort the column-based layout. " Lines must not be broken when typed. setlocal nowrap textwidth=0 " Undo the stuff we changed. let b:undo_ftplugin = "setlocal wrap< textwidth<" \ . "|if exists('*matchdelete')|call matchdelete(b:csv_match)|else|2match none|endif" \ . "|sil! exe 'nunmap H'" \ . "|sil! exe 'nunmap L'" \ . "|sil! exe 'nunmap J'" \ . "|sil! exe 'nunmap K'" \ . "|sil! exe 'nunmap '" \ . "|sil! exe 'nunmap '" \ . "|sil! exe 'nunmap 0'" \ . "|sil! exe 'nunmap $'" \ . "|sil exe 'augroup csv' . bufnr()" \ . "|sil exe 'au!'" \ . "|sil exe 'augroup END'" let b:changed_done = -1 " Highlight the first column, but not if reloading or resetting filetype. if !exists('b:csv_column') let b:csv_column = 1 endif " Following highlights column and calls GetNumCols() if set filetype manually " (BufEnter will also do it if filetype is set during load). silent call s:Highlight(b:csv_column) " Compare two lines based on the highlighted column. function! s:CompareLines(line1, line2) let col1 = matchstr(a:line1, s:GetExpr(b:csv_column_sort)) let col2 = matchstr(a:line2, s:GetExpr(b:csv_column_sort)) return col1 > col2 endfunction " Sort the n-th column, the highlighted one by default. function! s:SortCol(colnr) if a:colnr "" let b:csv_column_sort = b:csv_column else let b:csv_column_sort = str2nr(a:colnr) endif if b:csv_column_sort < 1 || b:csv_column_sort > b:csv_max_col call s:Warn("column number out of range.") endif call setline(2, sort(getline(2, '$'), function("s:CompareLines"))) endfunction command! -buffer -nargs=? Sort execute SortCol("") " Delete the n-th column, the highlighted one by default. function! s:DeleteCol(colnr) if a:colnr "" let col = b:csv_column else let col = str2nr(a:colnr) endif if col < 1 || col > b:csv_max_col call s:Warn("column number out of range.") endif exec '%s/'.s:GetExpr(col, 1).'//' if col b:csv_max_col silent %s/,$//e endif let b:csv_max_col -= 1 if b:csv_column > b:csv_max_col call s:HighlightPrevCol() endif endfunction command! -buffer -nargs=? DC execute DeleteCol("") " Search the n-th column. Argument in n=regex form where n is the column " number, and regex is the expression to use. If "n=" is omitted, then " use the current highlighted column. function! s:SearchCol(colexp) let regex = '\%(\(1-90-9*\)=\)\?\(.*\)' let colstr = substitute(a:colexp, regex, '\1', ) let target = substitute(a:colexp, regex, '\2', '') if colstr "" let col = b:csv_column else let col = str2nr(substitute(a:colexp, '=.*', '', '')) if col < 1 || col > b:csv_max_col call s:Warn("column number out of range") endif endif if col 1 let @/ = '^\%(\%("\%(^,\|""\)*\zs'.target.'\ze\%(^,\|""\)*"\)\|\%(^,"*\zs'.target.'\ze^,"*\)\)' else let @/ = '^\%(\%(\%("\%(^"\|""\)*"\)\|\%(^,"*\)\),\)\{' . (col-1) . '}\%(\%("\%(^,\|""\)*\zs'.target.'\ze\%(^,\|""\)*"\)\|\%(^,"*\zs'.target.'\ze^,"*\)\)' endif endfunction " Use :SC n=string to search for string in the n-th column command! -buffer -nargs=1 SC execute SearchCol("")|normal! n nnoremap H :call HighlightPrevCol() nnoremap L :call HighlightNextCol() nnoremap J :call Focus_Col(b:csv_column) nnoremap K :call Focus_Col(b:csv_column) nnoremap :call Focus_Col(b:csv_column) nnoremap :call Focus_Col(b:csv_column) nnoremap :call Focus_Col(b:csv_column) nnoremap :call Focus_Col(b:csv_column) nnoremap 0 :let b:csv_column=1:call Highlight(b:csv_column) nnoremap $ :let b:csv_column=b:csv_max_col:call Highlight(b:csv_column) nnoremap gm :call Focus_Col(b:csv_column) nnoremap J J nnoremap K K " The column highlighting is window-local, not buffer-local, so it can persist " even when the filetype is undone or the buffer changed. execute 'augroup csv' . bufnr() autocmd! " These events only highlight in the current window. " Note: Highlighting gets slightly confused if the same buffer is present in " two split windows next to each other, because then the events aren't fired. autocmd BufLeave silent call Highlight(0) autocmd BufEnter silent call Highlight(b:csv_column) augroup END Explanation of the regular expression The above code is fairly easy to understand, except perhaps for the regular expressions. The following regex (inspired by RegExLib 1520) is used several times: \%(\%("\zs\%(^"\|""\)*\ze"\)\|\%(\zs^,"*\ze\)\) It is explained as follows: 1. \%( # unbackref'ed grouping for one CSV field 2. \%( # first possibility of a field 3. " # starts and ends with " 4. \zs # beginning of matched string 5. \%( 6. ^"\|"" # anything not ", or a "" (escaped quote) 7. \)* 8. \ze # end of matched string 9. " 10. \) # end first possibility of field 11. \| # or 12. \%( # second possibility 13. \zs 14. ^,"* # any thing but " and comma 15. \ze 16. \) 17. \) Working with Excel xls files One can use the xls2csv Perl script to convert Excel files to CSV, and then view/edit with Vim. You may put the following code into .vimrc (or _vimrc for Windows) autocmd BufReadPre *.xls set ro | setf csv autocmd BufReadPost *.xls silent! %!xls2csv -q -x "%" -c - autocmd BufReadPost *.xls redraw It will view the XLS file in readonly mode as a CSV file by first converting on-the-fly. Comments In other localisations, ";" is used instead of "," in csv files, because "," is for decimals. Tab is also sometimes used. There should be a way to set the column separator character.